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1 INTRODUCTION 



There has been a great deal of interest in mapping diffuse Galac- 
tic radio emission, both to better understand our Galaxy and 
to clean out foreground contamination from Cosmic Microwave 
Background (CMB) maps. For reviews of such issues, see, e.g., 
from 111) to |23|) . and references therein. Because much of this work 
was both motivated by the CMB and based on maps from CMB ex- 
periments, it has focused primarily on frequencies above 10 GHz. 
Now that Neutral Hydrogen Tomography (NHT) is emerging as a 
promising cosmological tool where we can map the high-redshift 
universe three-dimensionally via the redshifted 21 cm line - see, 
e.g., from 1241) to (|44) ~ it is timely to extend these efforts down to 
lower frequencies. The redshift range 7 > jz > 50 where NHT may 
be feasible corresponds to the frequency range 30 — 180 MHz. In- 
deed, accurate foreground modeling is arguably even more impor- 
tant for NHT than for CMB: whereas unpolarized CMB fluctua- 
tions dominate over foregrounds for the most favorable frequencies 
and sky directions, and the situation for polarized CMB fluctuations 
is only 1-2 orders of magnitude worse, the cosmological neutral hy- 



ABSTRACT 

Understanding diffuse Galactic radio emission is interesting both in its own right and for mini- 
mizing foreground contamination of cosmological measurements. Cosmic Microwave Back- 
ground experiments have focused on frequencies > 10 GHz, whereas 21 cm tomography of 
the high redshift universe will mainly focus on < 0.2 GHz, for which less is currently known 
about Galactic emission. Motivated by this, we present a global sky model derived from all 
publicly available total power large-area radio surveys, digitized with optical character recog- 
nition when necessary and compiled into a uniform format, as well as the new Villa Elisa data 
extending the 1 .42 GHz map to the entire sky. We quantify statistical and systematic uncer- 
tainties in these surveys by comparing them with various global multi-frequency model fits. 
We find that a principal component based model with only three components can fit the 1 1 
most accurate data sets (at 10, 22, 45 & 408 MHz and 1.42, 2.326, 23, 33, 41, 61, 94 GHz) to 
an accuracy around 1%-10% depending on frequency and sky region. Both our data compila- 
tion and our software returning a predicted all-sky map at any frequency from 10 MHz to 100 
GHz are publicly available at http://space.mit.edii/home/angelica/gsm, 
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drogen signal is perhaps four orders of magnitudes smaller than the 
relevant foregrounds - see, e.g., from ( |4^ to ( |49|) . 

In due time, NHT experiments such as LOFAR Jsl: [3^; [STh. 
MWA 0; li^) and 21 CM A (40) will produce accurate low- 
frequency maps of Galactic emission much like CMB experiments 
have above 10 GHz. Even now, however, it is important to have a 
model for how this emission varies across the sky and across the 
NHT-relevant frequency band, because this helps optimize exper- 
imental design, scan strategy and data analysis pipelines to maxi- 
mize the scientific return on its investment. Even as new maps are 
made of small patches of sky, it is valuable to have a pre-existing 
global sky model to be able to quantify and mitigate contamination 
from distant sidelobes. 

As described in Section (2] the radio astronomy community 
has produced a large number of sky surveys over the years, many 
of which are directly relevant to the NHT frequency range. How- 
ever, most of them have never been used by cosmology community, 
because of various challenges: many are not publicly available in 
electronic form, and they are also hard to combine because they 
differ in sky coverage, angular resolution, pixelization and quality. 
Instead, a common approach among cosmologists has been to sim- 
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Figure 1. The maps show (from left to right, top to bottom) the 0.010 GHz BI), 0.0135 GHz Isl, 0.0175 GHz Jsi), 0.022 GHz 0.026 GHz (s^, 0.0345 
GHz 0.038 GHz (58), 0.045 GHz (63), 0.0815 GHz (55), 0.085 GHz (65), 0.150 GHz 0.176 GHz (H, 0.400 GHz (si), 0.404 GUz^, 0.408 
GHz fSO), 0.820 GHz (73), 1.42 GHz (74|;|7J;|7^ and 2.326 GHz (7g) surveys, and the CMB-free WMAP foreground maps at 23, 33, 41, 61 and 94 GHz 
(Il[ia.22,;ja)- 




Figure 2. Sky coverage/overlap, in Galactic coordinates, of the six key maps 
below the WMAP frequencies: the numbers from 1 to 6 correspond to 10, 
22, 45, 408, 1420 and 2326 MHz, respectively. 



ply extrapolate the 408 MHz Haslam map ( 1501) to lower frequen- 
cies, thus ignoring any spectral variation across the sky. As we will 
see, significantly better accuracy can be attained by modeling that 



includes additional data sets. The goal of the present paper is to col- 
lect, standardize and model this large body of radio data to make it 
more useful to the cosmology community. 

The rest of this paper is organized as follows. In Section[2l we 
describe how we compile all publicly available total power large- 
area radio surveys of which we are aware, digitizing them with opti- 
cal character recognition when necessary, and converting them into 
a uniform format. In Section [3] we compare different methods for 
constructing a global sky model from this data that covers the en- 
tire sky and the entire frequency range. In Section|4l we present the 
results of our modeling, quantify the accuracy of our best model, 
and briefly comment on implications for the physics underlying this 



2 DATA SETS 

In order to carry out our analysis, we performed a literature search 
for large-area total power sky surveys in the frequency range 1 
MHz to 100 GHz. The result of our search is shown in Figure [T] 
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Table 1. Available total power radio surveys. 
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A = Publicly available in digital form. 
B = Available on request. 
: Available as printed table (which we OCRed). 
D = Not available in any numerical form. 



and Table [T] Unfortunately, some of the surveys shown in Table [T] 
are not available in any numerical form. A minority of the surveys 
are publicly available in digital form (and/or) available on request, 
while many of the surveys are available only as printed tables which 
we converted to digital form using Optical Character Recognition 
(OCR). 

Some of the surveys shown in Figure[T]have a very large an- 
gular beam (the 0.0135, 0.0175, 0.026, 0.038, 0.0815, 0.176, 0.400 
and the 0.404 GHz maps), others are undersampled (the 0.085 and 



the 0.150 surveys), the 0.820 GHz survey is smoothed to 5° in its 
anti-centre area and, finally, one of them, the 0.0345 GHz map, it is 
missing large-scale structures and has severe striping that makes it 
unsuitable for use in our analysis. Therefore, all analysis presented 
in this paper is performed using the 0.010, 0.022, 0.045, 0.408, 
1.42, 2.326, 23, 33, 41, 61 and the 94 GHz maps - Figure [2] shows 
the sky coverage of these different surveys and how they overlap. 
They were all transformed to Galactic coordinates, pixelized using 
the HealPix RING scheme |83|) with resolution nside=512 (which 
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corresponds to 12 x 512^ — 3, 145, 728 equal-area pixels across 
the sky), and had the CMB component of 2.725 K subtracted. For 
the five 5-year WMAP maps used in this analysis, we removed 
their CMB component as described in {H;[23E] and then converted 
these maps to antenna temperature. Before performing our main 
analysis, we smooth all maps to a common final angular resolution 
of 5.1°. However, as described below, we also use full-resolution 
maps for some other purposes. 



3 METHODS 

A wide variety of models of Galactic emission have been used in 
the literature - see, e.g., ([l|;[l;0;B[i3) ™d references therein, and 
there are many additional popular fitting techniques that are purely 
statistical in nature and do not assume anything about the underly- 
ing physics. To maximize the utility of the available data sets and all 
the hard work that observers have invested into obtaining them, we 
explored a wide range of modeling methods before selecting one. 
Below we first present the criterion we will use for choosing the 
best modeling technique, then explore a range of methods to select 
the one that is most useful for our goal. The models we compare in- 
clude physics-inspired fitting functions, power laws, polynominals 
and splines as well as principal components. 

3.1 Criteria: accuracy and simplicity 

In this paper, our main criterion for chosing a method is accuracy. 
In other words, we wish to find the method that most accurately 
predicts the Galactic emission in any arbitrary sky direction and at 
any frequency between 0.010 to 94 GHz, independently of whether 
it is based on physical assumptions or is "blind" and purely statis- 
tical. In practice, we implement this criterion as follows: for each 
of the 1 1 frequencies where a high-quality sky map is available, we 
quantify how accurately a method can predict this map by using 
only information from the other 10 maps. 

Since the map used as the "truth" in each test may itself have 
noise and systematic errors, this procedure can overestimate the 
true errors. Moreover, our final Global Sky Model (GSM) uses all 
11 input maps jointly, not merely 10 at a time, and it is therefore 
more accurate than the model used in the test. For both of these rea- 
sons, the accuracy numbers we quote later on should be interpreted 
as conservative worst-case bounds on the actual accuracy. 

In addition to accuracy, we also desire simplicity. Specifically, 
it is valuable if the modeling method is simple and transparent 
enough to allow an analytical understanding of how the input deter- 
mines the output, especially if this makes it possible to characterize 
how noise and systematic errors propagate and affect the statistical 
properties of the resulting model. 

3.2 Method comparison 

3.2.1 Single-component models 

The Galactic InterStellar Medium (ISM) is a highly complex 
medium with many different constituents interacting through a 
multitude of physical processes. Free electrons spiraling around the 
Galactic magnetic field lines emit synchrotron radiation (85). For 

^ The new foreground-cleaned 5 year WMAP map and 
the five foreground-only maps are available on the web site 
|http://space.init.edii/home/angelica/gsm (bottom of page). 
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Figure 3. Comparison between the different GSM methods presented in 
Section 13.21 Squares show the 1 1 measurements available at a pixel at 
(/, 6)=(11.3°,89.6°). Lines show fits based on power-law-scaling of the 
Haslam map (straight solid/green line), a quadratic polynomial in log- 
log (dotted/blue curve), a cubic spline without the leftmost point (straight 
solid/black), and a 3-component PCA fit (straight solid/red). In the lower 
panel, the curves have been divided by the power-law-scaling of the Haslam 
map to make discrepancies between the methods even more visible. 

the lower frequencies where synchrotron radiation is expected to 
dominate the Galactic emission, a common approach in the liter- 
ature has been to simply scale the Haslam map ( 50) in frequency, 
usually with a power law 

where ?i is the unit vector pointing toward the i*** pixel in the 
map, 1/ is the frequency which this map is being scaled to, i/^ = 
408 MHz is the Haslam frequency, fi is the spectral inde]0, and T 
is the brightness temperature. However, the frequency dependence 
is known not to be a perfect power law: at higher frequencies, the 
slope of the synchrotron spectrum typically steepeno and other 
Galactic components such as free-free and dust emission begin to 
dominate. This suggests the use of a more general scaling of the 
type 

T{%,v)^T(v,,v,)f{v,), (2) 

where the spectrum /(i^i) is optimized by fitting to maps at other 
frequencies. We will quantify the accuracy of this approach in Sec- 
tion |4] The main problem with it is that the foreground frequency 



^ The straight green line in the upper panel of Figure [3] shows the case 
P = -2.8. 

One expects a spectral steepening towards higher frequencies, corre- 
sponding to a softer electron spectrum (>86|); Fig 5.3 in |87|) ). A recent anal- 
ysis done at 22 M Hz 1571) shows that /3 varies slightly over a large frequency 
range dHHSIil^ 
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dependence is known to vary across the sky. This occurs both be- 
cause the synchrotron spectral index /3 depends on the energy distri- 
bution of relativistic electrons ( 85), which varies somewhat across 
the sky, and also because the ratio of synchrotron to dust and other 
emission components can vary from place to place. In contrast, 
equation ^ assumes that a map of Galactic emission looks the 
same at all frequencies except for an overall change in amplitude. 

3.2.2 Polynomial and spline fitting 

Now that so much data is available, it is tempting to allow much 
more general fitting functions such as polynomials or cubic splines. 
We tested both of these approaches here and found that they gave 
their most accurate results when fitting in log-log (when fitting Ig T 
as a function of Ig 1/ rather than using T and/or v directly), since the 
function to be fit is then rather smooth - see Figure [3] (top panel). 
For instance, the quadratic polynomial fit 

In T{r,, v) = + /3(?) In ^ + 7(?.) (in ^) (3) 

generalizes equation Q to a position-dependent "running" 7 of the 
spectral index /3. For a given pixel i, let rrii denote the number of 
surveys that have observed it (6 ^ rrii ^5 11). Re-writing equa- 
tion l[3]l in a matrix form, we obtain 

y = Ax + n, (4) 

where y is an -dimensional vector that contains (the logarithm 
of) the temperatures at the i*'' pixel at the survey frequencies, 
A is an mi x 3 matrix that encodes the frequency dependence, and 
X is a 3-dimensional vector that contains the a, /? and 7 values at 
the i*'' pixel. The extra term n denotes noise in the broadest sense 
of the word, i.e., receiver noise, uncorrected offsets and calibration 
errors, and any other systematic effects or other non-sky signals 
present in the survey maps. This is an overdetermined system of 
linear equations since we always have rrii > 3 input maps avail- 
able, and assuming that the noise has zero mean, i.e., (n) — 0, the 
minimum-variance estimator for x is 

x= [A*N"^A]"^ A*N~V, (5) 
with covariance matrix 

S=[A^N-^A]"\ (6) 

where N is the noise covariance matrix (nn* ) . In Figure[3] we have 
simply approximated N by the diagonal matrix with Nu given by 
the rms of the i^^ map (we find the recovered maps to be rather 
insensitive to the choice of N). By performing this calculation for 
all the pixels in the sky, we obtain all-sky maps of the quantities a, 
13 and 7. Finally, to obtain a sky map at any frequency v, we simply 
use these values of a, P and 7 in equation l|3j. 

We also tried the approach of fitting the (log-log) frequency 
dependence in each pixel to a separate cubic spline. This involves 
even more fitting parameters (between 6 and 11), as the resulting 
curve is forced to match the data exactly at all observed frequen- 
cies. Maps of Of, 13 and 7 can then be produced by computing the 
first and second derivatives of the spline function. 

Figure [3] illustrates the pros and cons of the above-mentioned 
methods. Both the simple power law and the log-log quadratic poly- 
nomial are seen to provide poor fits simply because the physics is 
more complex than these functions can model. The figure shows 
that a power law is a poor approximation even in the synchrotron- 
dominated regime i^ > 1 GHz, because of a distinct steepening 
of the spectrum towards higher frequencies. However, the figure 
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Figure 4. The eigenvalues A^/ll for the 11 principal components, which 
can be interpreted as the fraction of the total variance at the 1 1 frequencies 
that each component explains. 

shows that going to the opposite extreme and allowing too many fit- 
ting parameters causes problems as well, from overfilling the data. 
The spline blindly goes through all the data points without any re- 
gard for what constitutes physically reasonable behavior, and sure 
enough is seen to perform poorly when forced to extrapolate. The 
ability to extrapolate reliably is crucial to our sky model because 
many of our input maps have only partial sky coverage. A related 
drawback of the spline approach is that if one of the input maps has 
more noise or systematic errors than others, this will fully propa- 
gate into the model rather than getting "voted down" by the other 
input maps. 

A final problem, seem most clearly in the bottom panel, is 
caused by fitting the log of the temperature rather than the tempera- 
ture itself: a relatively modest error in the predicted log-temperature 
translates into an exponentially amplified error in the temperature 
itself. The logarithmic fitting also complicates the modeling of 
measurement errors: if they are symmetrically distributed around 
zero and uncorrelated with the sky signal in the raw input maps, this 
is no longer true for the log-maps. In contrast, a linear combination 
of the linear input maps would preserve such desirable statistical 
properties of the noise. 

3.2.3 Principal Component Analysis 

The above examples suggest that we should try a method that: (1) 
can fit the spectral behavior of the data with as few parameters as 
possible; and (2J is linear (takes some linear combination of the raw 
input maps). In other words, we want a linear fitting method where 
the data itself is allowed to select the optimal parametrized form 
for the frequency dependence. Fortunately, the standard tool known 
as Principal Component Analysis (PC A) does exactly this (92). In- 
deed, we find that PCA performs better than all the approaches tried 
above when we implement it as described below. 
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Table 2. The three first principal components. 
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We begin by estimating the 11x11 matrix of second moments 



(7) 



by averaging over all of the ripix pixels i that were observed in all 
11 frequencies (the sky region marked as "123456" in Figure I2FI 
Therefore, the quantities 



•,1/2 



(8) 



simply give the rms fluctuations at each frequency, and the correla- 
tion matrix 

C-jk 



R 



(9) 



corresponds to the dimensionless correlation coefficients between 
all pairs of frequencies; — 1 ^ Rjfc ^ 1 and R_,j = 1. Defining 
the normalized maps Zi as the input maps rescaled to have rms 
fluctuations of unity at each frequency, R is simply the matrix of 
second moments of these normalized maps. 

We then diagonalize the matrix R, performing a standard 
eigenvalue decomposition 



R = PAP*, 



(10) 



where P is an orthogonal matrix (P*P = PP* = I) whose 
columns are the eigenvectors (principal components) and Aj^ = 
JjfcAj is a diagonal matrix containing the corresponding eigenval- 
ues, sorted in decreasing order. The eigenvalues \i are plotted in 
Figure |4l and the first three principal components are listed in Ta- 
ble |2] and shown in Figure [5] In this same table we also show the 
rms of the of each of the frequency maps calculated in the region 
123456 (second column). 

To help intuitively interpret this decomposition. Figure |6] 
shows maps of the first few principal components. Each princi- 
pal component map ai is defined as the dot product of the corre- 
sponding eigenvector with the normalized multi-frequency vector 
Zi for each pixel. For each pixel i, we can therefore transform back 
and forth between the normalized multi-frequency vector z; and 
the principal component vector a using the relations 



If we had also removed the mean of each map in this region (an issue to 
which we return latter), C would simply be the covariance matrix between 
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Figure 5. The frequency dependence is plotted for the first three princi- 
pal components, labeled by black dots, blue squares and red triangles, re- 
spectively. The top panel is in units of the total rms fluctuations at each 
frequency, whereas the middle panel shows the sky brightness temper- 
ature divided by (v/2.& GHz)^^-^] to keep all frequencies on roughly 
the same scale. The bottom panel shows typical spectra of various phys- 
ical components (8): synchrotron oc (long-dashed black), free-free 
oc u~^'^^ (dotted magenta), spinning dust (long-dashed green) and thermal 
dust (long-dashed red). It also shows half the sum (black dots) and differ- 
ence (blue squares) of the first two components, which are seen to behave 
roughly as synchrotron (with a spectral index that steepens toward higher 
frequency) and a sum of free-free, spinning and thermal dust (blue curve), 
respectively. 



— P Zj, Zi — Pa. 



(11) 



The principal component maps can be thought of as a division of 
the information in the 1 1 input maps into 1 1 mutually exclusive and 
collectively exhaustive chunks. They are mutually exclusive in the 
sense that they are uncorrelated: 



((P*z)(P*z)*> = P*(zz*)P = P*CP = A. 



(12) 



They are collectively exhaustive in the sense that they together 
specify the multifrequency information completely through equa- 
tion i ll It . Moreover, Figure|4]shows that almost all this information 
is contained in the first few principal components. Taking the trace 
of equation l llOt shows that 



11 

^ Aj = tr A = tr AP*P = tr PAP* = tr R = 11, 

i=i 



(13) 



the 1 1 frequencies. 



since the diagonal elements of the correlation matrix are all unity. 
In other words, the total variance to be explained in the normalized 
multifrequency data is 11, with a contribution of unity from each 
of the 1 1 normalized input maps, and equation l ll2t shows that the 
j"^ principal component map explains a variance \j, i.e., a fraction 
Aj/ll of the total. 

Figure|4]shows that the first component (top panel of Figure[6]l 
explains 80% of this total variance, the second component explains 
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Figure 6. The first three principal components, which can be crudely in- 
terpreted as maps of total "stuff" (top), synchrotron fraction (middle) and 
thermal dust fraction of non-synchrotron emission (bottom). The color scale 
corresponds to Ig(r/1K) in the top panel, and sinh~^ (T/IK)/ In 10 in 
the other panels to handle negative values (since sinh~^ x/ In 10 ~ Ig |a; 
for a; ^ 1 and for large positive and negative values, while it is roughly 
linear near zero). 



another 19%, the third explains another 0.6%, and all the remain- 
ing eight components combined explain merely the last 0.3%. This 
is very convenient: we set out searching for a way to accurately 
parametrize the frequency dependence of the radio sky with as few 
parameters as possible, and have found that as few as two parame- 
ters capture more than 99% of the information. 

Although principal component analysis is quite a standard 
data analysis technique our analysis also includes some non- 
standard procedures, tailored for the particular challenges that our 
global sky modeling problem poses: 

• We diagonalize R rather than C. 

• We perform no mean removal. 

• We make up for missing data by fitting to only the best princi- 
pal components. 



• We perform frequency interpolation by splining Ig ai and the 
component spectra. 

Let us now explain each of these procedures in more detail. 

Diagonalizing R rather than C corresponds to using the nor- 
malized maps rather than the raw maps as input for the PCA. We 
made this choice because we are equally interested in providing a 
good fit (in terms of percent of rms explained) at all 1 1 frequen- 
cies. If one took the raw maps as input, the PCA would instead 
focus almost entirely on optimizing the fit to the lowest frequency 
maps, since the increase of synchrotron temperature towards lower 
frequencies causes them to have by far the largest rms signal mea- 
sured in Kelvin. This usage of the normalized maps also has the 
advantage that the spectra of the dominant physical components 
become rather gently varying functions of frequency, which makes 
them much easier to linearly fit (see Figure[5j. This eliminates the 
need for logarithmic fits and their above-mentioned problems. 

In a standard PCA, one diagonalizes the covariance matrix 
((z — (z))(z — (z))'). We instead diagonalize the matrix (zz'), 
i.e., do not subtract off the mean from the normalized maps before 
computing their second moment matrix. This is because, as quan- 
tified in Section|4l this procedure makes the method more accurate 
in regions with incomplete data: whereas the principal components 
from the region with 1 1 frequency data work well across the en- 
tire sky (basically, because they reflect underlying physical emis- 
sion mechanisms which are the same everywhere), the 1 1 mean va- 
lues from this region are not at all representative for other regions, 
as they depend strongly on Galactic latitude. By not removing the 
mean, we also exploit the physical property that none of the dom- 
inant foreground components can ever contribute a negative inten- 

sitB 

Whereas a standard PCA can be performed in the region 
shown in Figure [2] where all 1 1 frequencies have been observed, 
we wish to build a global sky model covering the entire sky. For- 
tunately, we have rrii ^ 6 measured frequencies available every- 
where, and have found that much fewer than 6 parameters are re- 
quired for an excellent fit. We therefore take the best m, principal 
components determined in the region with complete data and fit 
them to the data available. In Section |4] we will explore what is 
the best choice of m, by quantifying the accuracy attained using 
1 ^ m* ^ 5 components. We perform this fitting by modeling the 
observed data in a pixel with rrii observed frequencies as 



Zi 



(14) 



where the tildes indicate that we are truncating to only m» compo- 
nents: Pi is the rrii x m, -dimensional matrix containing the m* 
first principal components as its columns, is the m, -dimensional 
vector corresponding to the first m, principal component maps (see 
Figure[5), Zi contains the nii normalized input maps that have data 
for this pixel, and the residual models random noise from both 
measurement errors and additional principal components not in- 
cluded in the fit. We perform this fitting separately for each pixel i 
by minimizing 



which gives the solution 



(15) 



^ The only sky signal with a non-CMB spectrum that can give a negative 
temperature contribution is the thermal SZ effect, and it makes a rather neg- 
ligible contribution compared to the synchrotron, free-free and dust compo- 
nents. 
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Table 3. Relative rms error in the sky region 123456. 



V 


Optimal 




Principal components used 




Unexplained 


[GHz] 




1 


2 


3 


4 


5 


fraction 


0.010 


0.062 


0.543 


0.078 


0.072 


0.065 


0.066 


0.00387 


0.022 


0.036 


0.450 


0.064 


0.060 


0.039 


0.038 


0.00126 


0.045 


0.035 


0.438 


0.046 


0.046 


0.038 


0.038 


0.00121 


0.408 


0.034 


0.379 


0.044 


0.044 


0.039 


0.039 


0.00115 


1.420 


0.111 


0.386 


0.135 


0.135 


0.150 


0.196 


0.01235 


2.326 


0.075 


0.235 


0.084 


0.083 


0.081 


0.137 


0.00562 


23 


0.015 


0.463 


0.058 


0.026 


0.026 


0.026 


0.00024 


33 


0.006 


0.504 


0.086 


0.009 


0.008 


0.008 


0.00004 


41 


0.009 


0.519 


0.089 


0.017 


0.015 


0.015 


0.00008 


61 


0.018 


0.542 


0.023 


0.023 


0.023 


0.023 


0.00033 


94 


0.057 


0.538 


0.225 


0.059 


0.059 


0.059 


0.00328 



a, = [P-N-^P,] 'P^N-^z,. (16) 

We describe our choice for the "noise" covariance matrix Ni = 
(nn*) in Section|4] 

Let us summarize the above steps: we first find the principal 
components using the sky region with data at all 1 1 frequencies, 
then use the frequency dependence of these best m, components to 
fit for maps of their amplitudes across the entire sky. This leaves us 
with an all-sky model predicting the emission at the 1 1 frequencies. 
However, we also wish to predict the emission at any frequency be- 
tween 10 MHz^ ^ 100 GHz. We do this by fitting the frequency 
dependence of both Igcrj and each of the m, principal compo- 
nents with a cubic spline as a function of Ig v. This works well 
only because, as seen in Figure|5] these are smooth, slowly varying 
functions. In contrast, it is notoriously difficult to perform useful in- 
terpolation of matrices, e.g., R, without wreaking havoc with their 
eigenvalues and physical behavior. 

4 RESULTS 

Above we have described how we construct our GSM. However, 
how accurate is it, and what does it teach us? 

4.1 Accuracy of our GSM 

4.1.1 Accuracy in the fully mapped region 

Table [3] shows the accuracy of our GSM in the sky region where 
we have data at all 1 1 frequencies. As described in Section lSTl we 
measure how accurately each map can be predicted by the other 
maps. Specifically, for each of the 1 1 frequencies, we compute the 
difference between this map and the map predicted by using only 
information from the other 10 maps, then tabulate the rms of this 
difference map divided by the rms of the observed map. Figure |7] 
illustrates this procedure for the more challenging all-sky case to 
which we return below: for these two example, the relative rms 
error is the rms of the bottom panel divided by the rms of the cor- 
responding top panel. 

Not surprisingly, Table[3] shows that using two components is 
much more accurate than using only one, reducing errors by almost 
an order of magnitude at some frequencies. Adding a third com- 
ponent is seen to further improve the accuracy, although not by as 
much, and mainly in the 20-40 GHz range. Adding a fourth compo- 
nent produces only minor gains, and reduces the 94 GHz accuracy 



ever so slightly, and adding a fifth component makes the accuracy 
noticeably worse at three frequencies, suggesting that we are be- 
ginning to overfit the data just as with the above-mentioned cubic 
spline approach. 

It is interesting to compare these accuracy numbers with what 
information theory tells us is the best possible case. The most ac- 
curate prediction for a given map using a linear combination of the 
others is easily computed using a standard linear regression anal- 
ysis {52), and these optimal errors are listed in the second column 
of Table [3] A popular statistical measure of how useful something 
is for predicting something else is the fraction of the variance that 
it explains. Under the heading of "unexplained fraction", we there- 
fore also tabulate the fraction /j of the map variance that is not 
explained by the other maps; this is simply the square of the rms 
residual, since the maps are normalized to have total variance of 
unity. Note that there is no need to actually perform a linear re- 
gression to compute these optimal numbers, as a simple derivation 
shows that they can be computed directly from the correlation ma- 
trix: 

The results of this analysis are very encouraging. Table [3] 
shows that the residuals achieved by our GSM with three compo- 
nents are very close to these smallest possible ones, which means 
that we need not worry about having overlooked some alternative 
modeling method that does much better. The results also raise an 
important question: if linear regression is so good, why do we not 
use it instead of our GSM? The answer is that we cannot: regression 
only works when the matrix R is known, and we can only compute 
R when we already have data at the frequency that we are trying to 
model. In other words, whereas we can use regression for accuracy 
testing, where we already know the answer, it does not help us with 
modeling all the unobserved frequencies between 10 MHz to 100 
GHz. We did explore the idea of predicting the R-matrix entries 
corresponding to new frequencies using interpolation, but were un- 
able to obtain useful results. In contrast, our GSM is straightfor- 
ward to interpolate to other frequencies, because we simply need to 
interpolate the spectra plotted in Figure|5] 

When we presented our GSM method described in Section[3] 
there were two details that we never specified: the choice of noise 
covariance matrix N in equation and the choice of m,, the 
number of principal components to use. Let us now discuss these 
two choices in turn. 
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408 MHz 



33 GHz 



.Observed 




Figure 7. Accuracy of our model at 408 MHz (left column) and 33 GHz (right column). The top row shows the observed data (the Haslam and WMAP 
Ka-band maps), the middle row shows the maps predicted by our 3-component GSM without using the observed map above it, and the bottom row shows the 
observation minus the prediction (which is visually indistinguishable from zero for the 33 GHz case, because the residuals are less than 1% in and around the 
Galactic plane). 



4.1.2 The noise covariance matrix 



The "noise" is simply the residual signal in a map that we are un- 
able to predict using the other maps, so it will contain contributions 
from both measurement errors in the input maps and sky emission 
mechanisms modeled with inadequate precision. Both of these con- 
tributions are captured by the remaining principal coinponents not 
included in the fit, which according to equation l llOt make a con- 
tribution to R that is PAP* except with all eigenvalues from the 
included components set to zero. However, it is easy to show that 
adding noise for the included components has no effect on the solu- 
tion of equation l |16l l, so we get exactly the same result if we simply 
set N = PAP* = R. As a reality check, we also tried the alter- 
native approach of setting N equal to a diagonal matrix with the 
optimal variance values from Table[3]on the diagonal, and obtained 
very similar results. 



4.1.3 Accuracy across the entire sky 

How many principal components should we include to maximize 
the GSM accuracy? To determine this, we must quantify the accu- 
racy not only in the best-case sky region where we have complete 
data, but also over the rest of the sky as well, since we ultimately 
care about the whole sky. Table|4]shows how this sky-averaged ac- 
curacy depends on the number of components used. Specifically, 
we have computed the relative rms error just as in Table[3] but se- 
parately for each of the 10 sky regions show in Figure|2] then com- 
puted their average weighting by sky area. The numbers show that 
the two best choices are 3 and 4 components. However, whereas 
these two choices were essentially tied in the fully observed region, 
m, = 3 comes out slightly ahead in the all-sky average because it 
is twice as accurate at 1.42 GHz. This means that, although the 
3-component model is slightly less sophisticated and therefore typ- 
ically slightly less accurate, it is also more robust and less likely 
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Table 5. rms sky signal in K for regions of different cleanliness 



[CjHz] 


1 


2 


-cleaner) re^ 
3 


^ion (dirtier- 
4 


5 


6 


0.010 


203740 


272337 


304115 


328310 


281838 




0.022 


27336 


41972 


63535 


98153 


118713 


130600 


0.045 


5486 


8347 


13019 


21285 


31287 


35926 


0.408 


20.0 


30.0 


52.0 


103.3 


182.9 


230.4 


1.420 


0.744 


1.021 


1.614 


3.016 


5.356 


6.839 


2.326 


0.150 


0.238 


0.487 


1.184 


2.196 


2.760 


23 


0.000098 


0.000260 


0.001106 


0.004140 


0.010357 


0.015078 


33 


0.000036 


0.000097 


0.000435 


0.001693 


0.004343 


0.006444 


41 


0.000021 


0.000056 


0.000255 


0.000996 


0.002569 


0.003851 


61 


0.000007 


0.000024 


0.000117 


0.000433 


0.001091 


0.001639 


94 


0.000006 


0.000022 


0.000106 


0.000332 


0.000758 


0.001078 



Table 4. Relative error averaged over the entire sky 



V 




Principal components used 




[GHz] 


1 


2 


3 


4 


5 


0.010 


0.438 


0.091 


0.098 


0.088 


0.168 


0.022 


0.690 


0.164 


0.144 


0.142 


0.138 


0.045 


0.712 


0.110 


0.109 


0.111 


0.100 


0.408 


0.436 


0.112 


0.115 


0.127 


0.134 


1.420 


0.546 


0.143 


0.144 


0.698 


0.875 


2.326 


0.216 


0.148 


0.155 


0.158 


0.503 


23 


0.423 


0.082 


0.062 


0.062 


0.064 


33 


0.453 


0.077 


0.013 


0.013 


0.014 


41 


0.458 


0.071 


0.032 


0.032 


0.031 


61 


0.444 


0.069 


0.068 


0.068 


0.070 


94 


0.385 


0.223 


0.121 


0.121 


0.160 




Figure 8. Our subdivision of the sky into six regions of decreasing cleanli- 
ness. From outside in, they correspond to WMAP-based junk map temper- 
atures T < 100/iK, lOO^tK - SOOajK, SOOajK - ImK. ImK - 3mK, 
3mK — lOmK, and T > lOmK, respectively. 



to go badly wrong in unusual parts of the sky. For this reason, we 
focus on the 3-component model in the rest of this paper. 

Table|4]shows that the typical accuracy is around 10% for the 
frequencies below WMAP, and noticeably better for the four low- 
est WMAP-frequencies. It is striking that the 33 GHz accuracy is 
as good as 1.3%, which means that if WMAP had not made this 
particular map, as much as 99.97% of the sky variance at this fre- 
quency could have been predicted by the other maps (and this is 
not even counting the CMB signal, which we subtracted off at the 
outset). Also, as shown in the results for the WMAP 94 GHz map, 
the amount of noise in a map clearly worsens the accuracy to which 
we can reconstruct it. 



4.1.4 Accuracy at different Galactic latitudes 

It is difficult to quantify the accuracy of our GSM in a meaning- 
ful way with a single number, since the sky signal varies so dra- 
matically with Galactic latitude: any measure of absolute error (in 
Kelvin) will therefore be dominated by the inner Galactic plane, 
while any measure of relative error will tend to be dominated by 
the cleanest regions where the signal-to-noise ratio is the poorest. 
To get a more nuanced picture of how accurate our GSM is, let 
us therefore quantify the relative errors separately for the regions 
shown in Figure [8] which subdivide the sky into six parts of in- 
creasing Galactic emission. They were defined in JT^ by comput- 
ing the four differences of WMAP maps at neighboring frequencies 



(to subtract out the CMB), computing the largest absolute value at 
each pixel, and making a contour plot of the resulting "junk map". 
From outside in, the regions correspond to junk map temperatures 
r < 100/iK, lOO^iK - SOO^iK, 300/iK - ImK, ImK - 3mK, 
3mK — lOmK, and T > lOmK, respectively, so the typical sky 
signal differs by about half an order of magnitude between neigh- 
boring regions. Table |5] shows that this scaling is roughly valid at 
all the WMAP frequencies, and that the differences between dirty 
and clean regions become less extreme towards lower frequencies. 
Although we of course do not have complete frequency coverage 
across the entire sky, comparing Figure|2]with Figure [8] shows that 
we are lucky to have coverage at all frequencies somewhere within 
each of the six sky regions of Figure [8] with the only exception that 
the very dirtiest region is not observed at 10 MHz. 

Table |6] summarizes how the accuracy of our 3-component 
GSM depends on both frequency and Galactic signal level. At the 
sub-GHz frequencies relevant to 21cm tomography, we see that the 
accuracy is typically ^ 10% or better in the cleanest parts of the 
sky, and degrades in the inner Galactic plane. For the higher fre- 
quencies relevant to CMB research, the situation is the opposite: the 
accuracy is best in the dirtiest parts of the sky (as good as 1% at 33 
and 41 GHz), but degrades in the cleanest regions. This is clearly 
due to the fact that detector noise is non-negligible at the higher 
WMAP frequencies, so that the lower the signal is, the lower the 
signal-to-noise level and the accuracy. Future WMAP data releases 
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Table 6. Relative error as a function of sky cleanliness 
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C 
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0.045 
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U.U /2 
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0.187 
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0.180 


0.157 


0.129 


2.326 


0.147 


0.160 


0.158 


0.165 


0.170 


0.167 


23 


0.111 


0.070 


0.083 


0.073 


0.062 


0.050 


33 


0.062 


0.022 


0.013 


0.012 


0.011 


0.011 


41 


0.091 


0.052 


0.042 


0.039 


0.034 


0.028 


61 


0.270 


0.067 


0.071 


0.082 


0.079 


0.067 


94 


0.766 


0.164 


0.095 


0.124 


0.136 


0.135 



are therefore likely to further improve the accuracy of our GSM at 
CMB frequencies. 

Finally, it is important to remember that the errors in our 
downloadable GSM are likely to be even smaller than the tables 
above suggest, because a map used as "truth" in a test may itself 
have noise and systematic errors, and also because it uses all 11 
input maps jointly, not merely 10 at a time. For example, one can 
clearly make vastly better predictions near 408 MHz than Table |6] 
suggests if the Haslam map is included in the modeling. 

4.2 Implications for our input maps 

An interesting byproduct of our modeling effort is an independent 
quality assessment of the 11 input maps. If two maps are highly 
correlated, this implies that none of them can be afflicted by large 
noise or systematic errors, which would have spoiled the correla- 
tion. More quantitatively, the unexplained variance fraction listed 
in Table |3] places an upper bound on the total contribution from 
detector noise and systematic errors in a map. If we focus on its 
square root, the optimal rms column in the same table, we see that 
the lowest frequency WMAP maps give the lowest residuals. This 
is not surprising, considering that in order to meet its CMB science 
goals, WMAP had to be designed with significantly stricter sys- 
tematic error control than typical in radio astronomy. As mentioned 
above, the WMAP increase in residuals with frequency reflects the 
drop in foreground signal while detector noise remains important 
and roughly constant. 

At the lower frequencies relevant to 21 cm tomography, we see 
that the 10-408 MHz map errors can be at most at the 10% level in 
the cleaner parts of the sky (see Table |6l(, and no more than 6% 
in the region where we have full frequency coverage (see Table [3] 
column 2). The remaining radio maps (at 1 .42 GHz and 2.326 GHz) 
have error bounds about a factor two higher. 

Finally, there is one kind of systematic error that our modeling 
cannot detect: an overall position-independent calibration error in 
a map. Because this would not affect the dimensionless correlation 
coefficients with other maps, it would not affect our goodness-of-fit 
either, merely cause corresponding calibration errors in the predic- 
tions. 

4.3 Physical interpretation of our GSM 

The goal of this paper is simply to model the Galactic emission, 
not to understand it physically. However, since our statistical results 



automatically encode interesting physical information, let us briefly 
comment on possible interpretations. 

4.3.1 Component interpretation 

A number of physical components of Galactic emission in our fre- 
quency range have been thoroughly discussed in the literature, no- 
tably synchrotron radiation, free-free emission, spinning dust and 
thermal dust. However, we should not expect these physical com- 
ponents, which tend to be highly correlated, to match our principal 
components, which are by definition uncorrelated. We should ins- 
tead expect our first principal component (top panel in Figure|6]l to 
trace the total amount of "stuff", and the remaining principal com- 
ponents to describe how the ratios of different physical components 
vary across the sky. The frequency dependence seen in Figure |5] 
confirms this. The first component is shown to contribute an essen- 
tially constant fraction of the rms at all frequencies, corresponding 
to Ai/11 f» 80% of the total variance. 

The second component, which explains another A2/II ~ 
19% of the total variance, is seen to have the negative of a 
synchrotron-like spectrum below a few GHz, and a spectrum at 
higher frequencies that is suggestive of a sum of free-free emis- 
sion, spinning dust and thermal dust. This suggests that this com- 
ponent encodes what fraction of the total emission is due to syn- 
chrotron radiation. Sure enough, the second panel in Figure |6] is 
seen to be negative in the north polar spur region which is known 
to be dominated by synchrotron emission, and positive in regions 
like the inner Galactic plane and the Large Magellanic Cloud where 
one expects higher fractions of non- synchrotron emission. 

The third component, which explains two thirds of the remain- 
ing variance (and A3/II ~ 0.6% of the total variance), is seen 
in Figure |5] to have a spectrum that looks like thermal dust at the 
high end, goes negative below that, and essentially vanishes below 
a few GHz where synchrotron radiation becomes dominant. This 
suggests crudely interpreting it as encoding what fraction of the 
non-synchrotron signal is due to to thermal (vibrational) dust emis- 
sion. It is unclear whether the 10 MHz blip in its spectrum is a 
fluke or reflects a physical correlation between dust properties and 
low-frequency synchrotron properties like self-absorption ( l93h . 

4.3.2 Synchrotron and non-synchrotron templates 

As we discussed before, we do not expect our principal components 
to correspond directly to physical components, because the former 
are by definition uncorrelated while the latter are not ("stuff traces 
stuff", and there tends to be more of everything at low Galactic 
latitudes). However, it is interesting to ask whether we can form 
linear combinations of our principal components that have a simple 
physical interpretation. 

Interestingly, we can. In Figure |5] we see that taking the sum 
and difference of the first two principal components (from the sec- 
ond panel) gives components whose spectra look distinctly like 
what is theoretically expected for synchrotron and a combination of 
the other emission components, respectively (as seen in the bottom 
panel). First of all. Figure |5] (bottom) shows that the two new tem- 
plate spectra are approximately non-negative at all 1 1 frequencies. 
This is a non-trivial result, since generic 11 -dimensional eigenvec- 
tors or combinations of them will have both significantly negative 
and significantly positive components — in contrast, we know that 
neither synchrotron, free-free nor dust emission can be negative. 
Second, the same figure shows that first template, which we will 



© 2008 RAS, MNRAS 000,[T]{T6] 



12 de Oliveira-Costa et al. 




Figure 9. Our synchrotron (top) and non-synchrotron (bottom) templates are the sum and difference of our first two principal components, where the color 
scales coiTesponds to lg(T/lK). Labels indicate bright objects in our Galaxy such as supernova remnants (Cas A, North Polar Spur, Loop III), an emission 
nebula (Gum Nebula), giant molecular clouds (Orion A, R Corona Australis, the Ophiucus Complex, W3) and an active star-forming region (Cygnus Region) 
as well as bright extragalactic sources like giant elliptical galaxies (Virgo A, Fornax A), radio galaxies (Centaurus A, 3C84) and quasars (3C273, 3C279). 
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hereafter refer to as our synchrotron template, has a spectral index 
/3 ~ — 2.5 at low frequencies, gradually steepening towards higher 
frequencies just as expected for synchrotron radiation (1861: i87l") . In 
contrast, the second template, which we will refer to simply as 
our non-synchrotron template, is seen to have a spectrum such that 
v^'^Iiy) rises toward higher frequencies, and can be fit by a sum of 
free-free, spinning dust 1 194) and thermal dust emission. The corre- 
sponding sky maps (the sums and differences of the first two prin- 
cipal components) are shown in Figure |9] 

In other words, our spectral results are consistent with the 
interpretation that the top panel of Figure |9] is a diffuse syn- 
chrotron template while the bottom one is a template of diffuse non- 
synchrotron emission. As expected, known supernova remnants 
subtending large angles (Cas A, North Polar Spur and Loop III) are 
prominent in the synchrotron template, while diffuse dusty sources 
like the Cygnus Region stand out in the other template. However, 
we cannot make any such interpretations for the point sources that 
appear in the paper. This is because some point sources were re- 
moved from some of the low-frequency radio maps we used, which 
can fool our analysis into removing them from the synchrotron tem- 
plate. For example, in the 22 MHz map that we used, areas around 
the strong point sources Tau A, Cas A, Cyg A and Vir A. have 
been blanked. At 1420 MHz, only Cas A was blanked. Although 
it would be useful to repeat our analysis with new versions of the 
input maps where point sources have not been removed (or where 
they have been reinserted using measured fluxes), the present paper 
of course has the opposite focus: our key purpose is to model the 
diffuse emission for use in the cleanest parts of the sky, which are 
the ones most relevant to 21 cm tomography and CMB observa- 
tions. 

It is worth emphasizing the blind nature of our analysis: 
by simply forming those two unique linear combinations of our 
two dominant principal components for which the spectra were 
non-negative, our approach discovered the synchrotron and non- 
synchrotron spectra in the data using no physics input whatsoever 

4.4 Angular resolution options 

To be able to use all II of our input maps, our spectral model- 
ing has been performed at 5.1°, the lowest common denominator. 
If we make the approximation that the spectral shape, but not its 
amplitude, varies only slowly across the sky, then we can create a 
higher resolution global sky model by locking the amplitude to a 
higher resolution input map. For example, for each pixel, we can 
rescale all three principal components used by the same constant, 
chosen such that the prediction at 408 MHz matches the full resolu- 
tion Haslam map. This procedure is illustrated in Figure|9l the top 
panel locks to the 1° Haslam map (recommented for applications 
below 1 GHz where synchrotron dominates) while the bottom panel 
locks to the WMAP 23 GHz map smoothed to 2° to suppress de- 
tector noise (recommended for applications at CMB frequencies). 
These 1°, 2° and 5.1° versions of our GSM are all available on the 
above-mentioned website. Figure [Tolshows examples of our output 
maps at 5.1° resolution. 



5 CONCLUSIONS 

We have presented a global sky model for 10 MHz to 100 GHz 
Galactic emission derived from all publicly available total power 
large-area radio surveys, digitized with optical character recogni- 
tion when necessary and compiled into a uniform format. Both 



our data compilation and software for returning a predicted all-sky 
map at any frequency from 10 MHz to 100 GHz are available at 
,http: //space.niit.edu/honie/angelica/gsm 

We found that a PCA-based model with only three compo- 
nents can fit the II most accurate data sets (at 10, 22, 45 & 408 
MHz and 1.42, 2.326, 23, 33, 41, 61, 94 GHz) to an accuracy 
around l%-10% depending on frequency and sky region. We found 
that using these three principal components comes very close to the 
maximal accuracy allowed by information theory, with the added 
advantage of allowing robust frequency interpolation and some 
physical interpretation. The fact that our model has so few fitting 
parameters in a given spatial direction also makes it rather robust 
to the input data: a map with lots of noise or systematic errors will 
have smaller correlations with other maps, and therefore and get 
"voted down" by the other maps and given less weight. 

Strong correlations between different physical emission me- 
chanisms would explain why such accurate fits are possible with 
fewer principal components than known physical components: one 
rapidly counts beyond three when including free-free emission, 
spatial variations of the synchrotron and dust spectra, etc. 

We have focused entirely on unpolarized Galactic emission. 
To help maximize the future scientific impact of 21 cm tomogra- 
phy experiments, it will be important to extend this work to both 
extragalactic point sources and polarized emission. Since these ex- 
periments will provide a gold mine of cosmological information 
buried by under ^ 10^ times larger foreground signals, this should 
be well worth the effort! 
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